Reinforcement Learning with Parameterized Actions
نویسندگان
چکیده
We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions—discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. We introduce the Q-PAMDP algorithm for learning in these domains, show that it converges to a local optimum, and compare it to direct policy search in the goalscoring and Platform domains.
منابع مشابه
Active exploration in parameterized reinforcement learning
Online model-free reinforcement learning (RL) methods with continuous actions are playing a prominent role when dealing with real-world applications such as Robotics. However, when confronted to non-stationary environments, these methods crucially rely on an exploration-exploitation trade-off which is rarely dynamically and automatically adjusted to changes in the environment. Here we propose a...
متن کاملThompson Sampling for Learning Parameterized Markov Decision Processes
We consider reinforcement learning in parameterized Markov Decision Processes (MDPs), where the parameterization may induce correlation across transition probabilities or rewards. Consequently, observing a particular state transition might yield useful information about other, unobserved, parts of the MDP. We present a version of Thompson sampling for parameterized reinforcement learning proble...
متن کاملAdaptive agents on evolving networks
In this work we study the learning dynamics for agents playing games on networks. We propose a model of network formation in repeated games where players strategically adopt actions and connections simultaneously using a reinforcement learning scheme which is called Boltzmann-Q-learning. This adaptation scheme in the continuous time limit has a proven relation to the evolutionary game theory th...
متن کاملCross-Entropy Method for Reinforcement Learning
Reinforcement Learning methods have been succesfully applied to various optimalization problems. Scaling this up to real world sized problems has however been more of a problem. In this research we apply Reinforcement Learning to the game of Tetris which has a very large state space. We not only try to learn policies for Standard Tetris but try to learn parameterized policies for Generalized Te...
متن کاملReinforcement Learning for Parameterized Motor Primitives [IJCNN1759]
One of the major challenges in both action generation for robotics and in the understanding of human motor control is to learn the “building blocks of movement generation”, called motor primitives. Motor primitives, as used in this paper, are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. While a lot of progress has been mad...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016